The Effect of Executing Mispredicted Load Instructions in a Speculative Multithreaded Architecture

Authors

  • Resit Sendag
  • Ying Chen
  • David J. Lilja
Abstract

Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism in application programs. A single-threaded sequencing mechanism needs speculative execution beyond conditional branches in order to exploit more instruction-level parallelism. In addition, an aggressive multithreaded architecture should also use thread-level control speculation in order to exploit more thread-level parallelism. The instruction- and thread-level speculative execution of load instructions in a multithreaded architecture has a greater impact on the performance of the cache hierarchy as the design becomes more aggressive, using wider-issue processors and more thread units. In this study, we investigate the effects of executing mispredicted load instructions on the cache performance of a scalable multithreaded computer system. The execution of loads down the wrongly predicted branch path within a thread unit, or in a wrongly forked thread, can result in an indirect prefetching effect for correct execution. This is possible even after the outcome of a control speculation is known. By allowing mispredicted load instructions to continue execution even after the instruction- or thread-level control speculation is known to have failed, we show that we can reduce the cache misses for the correctly predicted paths and threads. However, these additional loads can also increase the amount of memory traffic and can pollute the cache. Our results show that the performance of a concurrent multithreaded architecture can be improved by as much as 14%, while the number of L1 data cache misses is reduced by up to 35%.
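The trade-off described in the abstract above, where wrong-path loads may either prefetch useful data or pollute the cache, can be illustrated with a toy model. The sketch below is not the authors' simulator: the fully associative LRU cache, the `correct_path_misses` helper, and both traces are hypothetical and chosen only to make the two effects visible.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU cache of block addresses (illustrative only)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = OrderedDict()

    def access(self, block):
        """Return True on a hit; on a miss, fill the block and evict the LRU one."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.num_blocks:
            self.blocks.popitem(last=False)
        return False


def correct_path_misses(trace, num_blocks, run_wrong_path_loads):
    """Count the misses seen by correct-path loads only.

    trace is a list of (block, on_correct_path) pairs in issue order.
    When run_wrong_path_loads is False, wrong-path loads are squashed
    and never touch the cache.
    """
    cache = LRUCache(num_blocks)
    misses = 0
    for block, on_correct_path in trace:
        if on_correct_path:
            if not cache.access(block):
                misses += 1
        elif run_wrong_path_loads:
            cache.access(block)  # the wrong-path load still fills the cache
    return misses


# Hypothetical trace: the wrong path touches blocks 10 and 11,
# which the correct path needs shortly afterwards (prefetching effect).
helpful_trace = [(1, True), (10, False), (11, False),
                 (2, True), (10, True), (11, True)]

# Hypothetical trace: the wrong path touches blocks 20 and 21, which are
# never reused and evict blocks 1 and 2 from a 2-block cache (pollution).
polluting_trace = [(1, True), (2, True), (20, False), (21, False),
                   (1, True), (2, True)]
```

In this toy setting, letting the wrong-path loads complete halves the correct-path misses for `helpful_trace` with a 4-block cache, but doubles them for `polluting_trace` with a 2-block cache. Memory traffic, which the abstract also mentions, is not modeled here.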


Similar resources

Using Incorrect Speculation to Prefetch Data in a Concurrent Multithreaded Processor

Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. The resulting speculative issuing of load instructions in these architectures can significantly impact the performance of the memory hierarchy as the system exploits higher degrees of parallelism. In this study, we in...


Reducing Misspeculation Penalty in Trace-Level Speculative Multithreaded Architectures

Trace-Level Speculative Multithreaded Processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating on the result of several traces. The other thread executes speculated traces and verifies the speculation made by the first thread. Speculated traces are validated by ver...


A Non-blocking Multithreaded Architecture with Support for Speculative Threads

In this paper we provide both a qualitative and a quantitative evaluation of a decoupled multithreaded architecture that uses non-blocking threads. Our architecture is based on simple in-order pipelines and complete decoupling of memory accesses from execution pipelines. We extend the architecture to support thread level speculation using snooping cache coherency protocols. We evaluate the perf...


A Study of Mispredicted Branches Dependent on Load Misses in Continual Flow Pipelines

Large instruction window processors can achieve high performance by supplying more instructions during long latency load misses, thus effectively hiding these latencies. Continual Flow Pipeline (CFP) architectures provide high performance by effectively increasing the number of actively executing instructions without increasing the size of the cycle-critical structures. A CFP consists of a Slic...


Speculative Precomputation

Current processors are based on a multithreaded architecture. Simultaneous Multithreading (SMT) techniques are used to increase instruction throughput under a multiprogramming workload; however, they do not improve performance when only a single thread is executing. This communication explores Speculative Precomputation, a technique that uses idle thread contexts in a multithreaded architecture...




Publication date: 2002